An Automatic Approach for Translating Simple Images into Text Descriptions and Speech for Visually Impaired People

Authors

Mrunmayee Patil

Ramesh Kagalkar

Girish Kulkarni

Visruth Premraj

Vicente Ordonez

Sagnik Dhar

Siming Li

Yejin Choi

Alexander C. Berg

Tamara L. Berg

Benjamin Z. Yao

Xiong Yang

Liang Lin

Mun Wai Lee

Song-Chun Zhu

Fan-Chieh Cheng

Shih-Chia Huang

James Z. Wang

Munawar Hayat

Mohammed Bennamoun

Mina Makar

Sam S. Tsai

David Chen

Abstract

Image processing is a rapidly growing field of research. Images come in many file formats and depict many subjects: objects, places, people, scientific and astronomical scenes, and more. An image is a collection of pixels arranged in rows and columns. Images are captured, processed and stored for various uses. Most people can easily identify and analyze general images, but this is difficult for blind and physically disabled people. Unfortunately, there is little in the way of an existing medium or interface that helps these users communicate with the world. Blind or visually impaired people are often neglected by society, so there is a continuing need for tools that assist them. Hence, we propose a new technique for converting images into text and speech using image processing methods such as pre-processing, image segmentation, edge detection, object detection and speech synthesis. In this paper we first introduce the need for image-to-text conversion for blind people and give an overview of the image-to-text and speech conversion system. Edge detection plays an important role in this system: the Canny edge detection algorithm is used to detect objects in images. Object recognition is based on the color, size, texture and shape of the object.
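The abstract names Canny edge detection but gives no implementation details. As an illustrative sketch only, and not the authors' code, the pure-Python example below shows two of the stages that Canny builds on: Sobel gradient magnitude and double-threshold hysteresis. Gaussian smoothing and non-maximum suppression are omitted for brevity, so this is a simplification rather than the full algorithm. The function names (`sobel_magnitude`, `hysteresis`, `detect_edges`) and the threshold values are assumptions chosen for the example.

```python
import math


def sobel_magnitude(img):
    """Gradient magnitude of a 2D grayscale image (list of lists).

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal derivative: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical derivative: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = math.hypot(gx, gy)
    return mag


def hysteresis(mag, low, high):
    """Keep strong pixels (>= high) plus weak pixels (>= low) connected to them."""
    h, w = len(mag), len(mag[0])
    edges = [[False] * w for _ in range(h)]
    stack = [(y, x) for y in range(h) for x in range(w) if mag[y][x] >= high]
    for y, x in stack:
        edges[y][x] = True
    # Grow outward from strong pixels through 8-connected weak neighbours.
    while stack:
        y, x = stack.pop()
        for ny in range(max(0, y - 1), min(h, y + 2)):
            for nx in range(max(0, x - 1), min(w, x + 2)):
                if not edges[ny][nx] and mag[ny][nx] >= low:
                    edges[ny][nx] = True
                    stack.append((ny, nx))
    return edges


def detect_edges(img, low=100.0, high=300.0):
    """Simplified edge map: Sobel magnitude followed by hysteresis thresholding."""
    return hysteresis(sobel_magnitude(img), low, high)


if __name__ == "__main__":
    # Synthetic test image: dark left half, bright right half.
    image = [[0] * 4 + [255] * 4 for _ in range(6)]
    for row in detect_edges(image):
        print("".join("#" if e else "." for e in row))
```

In practice a library routine such as OpenCV's `cv2.Canny` would normally be used instead; the sketch is only meant to make the gradient and thresholding steps concrete.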


Similar resources

Haptic Browser: A Haptic Environment to Access HTML Pages

The application presented in this paper aims at producing a novel user-friendly haptic environment to allow blind or visually impaired people to access interactive presentations based on HTML web pages. The application is based on haptic and audio feedback. Additionally, an automatic HTML-to-haptics conversion tool is developed in order to provide a simple way to create interactive haptic presenta...


From Image to XML: Monitoring a Page Layout Analysis Approach for the Visually Impaired

Page layout analysis and the creation of an XML document from a document image are useful for many applications including the preservation of archived documents, robust electronic access to printed documents, and access to print materials by the visually impaired. In this paper, the authors describe a document image process pipeline comprised of techniques for the identification of article head...


Audiodescription research: state of the art and beyond

Audiodescription (AD) is a growing arts and media access service for visually impaired people. As a practice rooted in intermodal mediation, i.e. 'translating' visual images into verbal descriptions, it is in urgent need of interdisciplinary research-led grounding. Seeking to stimulate further research in this field, this paper aims to discuss the major dimensions of AD, give an overview of com...


Improvement of generative adversarial networks for automatic text-to-image generation

This research concerns the use of deep learning tools and image processing technology for the automatic generation of images from text. Previous research has used a single sentence to produce images. In this research, a memory-based hierarchical model is presented that uses three different descriptions, each given as a sentence, to produce and improve the image. The proposed ...


Kannada Text Extraction from Images and Videos for Vision Impaired Persons

We propose a system that reads the Kannada text encountered in natural scenes, with the aim of assisting visually impaired persons in Karnataka state. This paper describes the system design and a standard-deviation-based Kannada text extraction method. The proposed system contains three main stages: text extraction, text recognition and speech synthesis. This paper concentrated on te...



Publication date: 2015
